- Digitized Voice Programmer's Toolkit for the PC
- -----------------------------------------------
-
- Version 1.0
-
- Copyright (c) 1988,1989, Farpoint Software
-
- * * * * * * * * * * * * * * * * *
-
-
- **************************************************************************
- * *
- * To those of you who have HIDI.ARC and/or DIGITS.ARC, welcome back. *
- * This new release will serve as a major upgrade to things you already *
- * have. *
- * *
- **************************************************************************
-
-
- Introduction
- ------------
-
- This toolkit is a combination of software and hardware designed for the
- purpose of mechanizing and simplifying the process by which programmers may
- create digitized voice recordings, store them on disk, edit the voice data
- files, and incorporate digitized voice playback into their own high-level
- language programs.
-
- The recording of digitized voice requires a small, inexpensive hardware device
- to be built. Schematics and printed circuit board layout files are provided
- for this device.
-
- Playback of the digitized voice, however, requires NO SPECIAL HARDWARE. The
- sound is produced with the built-in speaker provided in nearly all PC's and
- PC-compatible machines. This means that programs may be written for general
- distribution which will play voice messages on the user's machine as it
- exists.
-
- Here is a list of the major features of the current software package:
-
- (1) Operates under the DOS environment.
- (2) Provides a full set of voice record/playback control routines which
- are directly callable from many high-level languages including C
- and Pascal. They are also of course callable from assembly language.
- (3) All voice operations proceed IN THE BACKGROUND. The control routines
- return to the caller immediately, and voice playback occurs
- simultaneously with the continuing execution of the main program.
- The main program may call a status routine at any time to check on
- the progress of the voice playback.
- (4) There are no length limitations on either the size of the memory
- buffers or the size of the voice data files on disk other than the
- physical limits of the machine itself. 64k is not a special number.
- (5) A sophisticated voice data file editor is provided. This gives the
- programmer a set of capabilities similar to those available on a
- conventional tape recorder. Position markers, live overwriting,
- selective erasure, cut-and-paste, and assorted other features make
- the production of "refined" voice files an easy task.
- (6) Several short example programs are included, written in both C and
- assembly language, which demonstrate the use of the calls to the
- voice modules. There is even an example of a memory-resident program
- which detects the pressing of the left shift key and plays a short
- voice message when this occurs. (Foreground processing continues
- undisturbed.)
-
-
- Shareware Notice
- ----------------
-
- The Digitized Voice Programmer's Toolkit is released as Shareware. This is
- copyrighted material; it is NOT "free software". You are permitted to
- experiment with this package long enough to determine if it suits your needs,
- but if you will be making use of the material in your own programs, then a
- license fee of $50 is required. NO PROGRAM WHICH MAKES USE OF THE MATERIALS
- IN THIS TOOLKIT MAY BE SOLD COMMERCIALLY OR ON A CONTRACT BASIS UNLESS THE
- SELLER HAS PAID THE LICENSE FEE. Please make the check or money order payable
- to:
-
- Farpoint Software
- 2501 Afton Court
- League City, Texas 77573
-
- For convenience, a registration form is included in the file REGISTER.FRM.
-
- As a registered user, you will receive updates automatically long before they
- are released to BBS's. You will also receive a copy of the source code to the
- VDFE editor. Registered users, of course, are given higher priority if
- programming assistance or hardware construction assistance is requested.
-
- You are granted permission to distribute copies of the Digitized Voice
- Programmer's Toolkit, provided that (1) no fee is charged for such copies,
- other than a nominal disk duplication fee, (2) these files are distributed
- in their original, unmodified form, and (3) ALL the files in the original
- archive are included with each copy. (See "List of Files" below.)
-
- If you paid a "disk duplication fee" or other such fee to a distributor of
- public domain and shareware programs, be aware that the payment of this fee
- DOES NOT constitute registration of this Toolkit. Likewise, the payment of a
- fee to any Bulletin Board Service for the time required to download this
- Toolkit DOES NOT constitute registration. Registration occurs only through
- direct interaction with Farpoint Software.
-
- If more information is needed, write or contact Alan D. Jones through
- Compuserve Information Service at user ID 74030,554.
-
-
- List of Files
- -------------
-
- The files included with the Digitized Voice Programmer's Toolkit are:
-
- BIN2ASM
- BIN2ASM.C
- BIN2ASM.EXE
- EMBEDDED
- EMBEDDED.C
- EMBEDDED.EXE
- EVM.PRE
- EVM.SUF
- EVM.VOI
- LONGTEST.VOI
- README.1ST
- REGISTER.FRM
- RUN_ME.BAT
- TSR
- TSR.ASM
- TSR.EXE
- TSRVM.PRE
- TSRVM.SUF
- TSRVM.VOI
- VDFE.EXE
- VMSCH.HPP
- VOICEKIT.DOC
- VPMOD.ASM
- VPMOD.DOC
- VPMOD.H
- VPMOD.OBJ
- VPTEST
- VPTEST.C
- VPTEST.EXE
- VRMOD.ASM
- VRMOD.DOC
- VRMOD.H
- VRMOD.OBJ
- VRTEST
- VRTEST.C
- VRTEST.EXE
-
- If you received the Toolkit with any of the above files missing, please
- notify Farpoint Software.
-
-
- Description of Voice Subroutine Modules
- ---------------------------------------
-
- The key software elements in the kit are two assembly language programs,
- VRMOD.ASM and VPMOD.ASM, and their assembled OBJ files. These are not stand-
- alone programs. They are designed to be linked with other programs to provide
- the voice control routines. The calls associated with recording are in
- VRMOD, and the calls associated with playback are in VPMOD. Any given program
- may be linked with either or both of these modules. Typically, a program
- designed for general distribution would be linked only with VPMOD, since
- recording requires the hardware device.
-
- The external hooks to the two modules consist of various "public" procedure
- names. All procedures use the Pascal calling convention, since most high-level
- language compilers can support this calling method. The Pascal calling
- convention has the following meaning:
-
- (1) Procedure names are all caps, and are not preceded by an underscore.
- (2) Procedures are called with "far" (intersegment) calls.
- (3) Short return values appear in the AX register; long return values
- appear in DX:AX.
- (4) Parameters are pushed onto the stack in left-to-right order; i.e. the
- first parameter in the list is pushed first. If the parameter is a
- doubleword, then the high order word is pushed first.
- (5) The called subroutine is responsible for clearing the parameters from
- the stack upon return.
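-
- Rule (4) of the list above can be pictured with a small simulation. This is
- only a sketch in portable C, not code from the toolkit: the procedure
- EXAMPLE_PROC and its parameter list (one word, one doubleword) are
- hypothetical, and the real signatures are the ones given in VRMOD.DOC and
- VPMOD.DOC. The log simply records words in the order a caller would push them.
-

```c
#include <stddef.h>
#include <stdint.h>

/* Simulated 16-bit stack: words are stored in the order they are pushed. */
typedef struct {
    uint16_t words[16];
    size_t   count;
} PushLog;

void push_word(PushLog *log, uint16_t w)
{
    if (log->count < 16)
        log->words[log->count++] = w;
}

/* Rule (4): a doubleword goes onto the stack high-order word first. */
void push_dword(PushLog *log, uint32_t d)
{
    push_word(log, (uint16_t)(d >> 16));
    push_word(log, (uint16_t)(d & 0xFFFFu));
}

/* Hypothetical call EXAMPLE_PROC(flag, length):
   the leftmost parameter is pushed first. */
void push_example_call(PushLog *log, uint16_t flag, uint32_t length)
{
    push_word(log, flag);
    push_dword(log, length);
}
```

- Starting from an empty log, pushing flag = 1 and length = 00012345h records
- the words 0001h, 0001h, 2345h in that order. Under rule (5) the called
- procedure, not the caller, would remove all three words on return.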
-
- The above list will be of interest primarily to assembly language programmers.
- When working in a high-level language, it is necessary only to make sure that
- the compiler is using the proper calling method. For C programs, two header
- files have been included. They are VRMOD.H and VPMOD.H. At the beginning of
- any C program which is to use the voice playback routines, insert the line:
-
- #include "vpmod.h"
-
- This file contains prototypes of all procedure calls in VPMOD.ASM, declared
- in a way that causes the compiler to generate correct calling code.
-
- The details of how each individual procedure call operates will be found in
- the separate documents VRMOD.DOC and VPMOD.DOC. It is suggested that you
- print these files for use as reference material while writing programs.
-
- It is possible to link both VRMOD.OBJ and VPMOD.OBJ to the same program, but
- you should NOT have both packages initialized at the same time. Each package
- assumes "ownership" of timer channel zero, and this would cause a conflict
- over the setting of the hardware timer interval, not to mention the problem
- of possibly insufficient CPU time to execute both interrupt routines at every
- timer tick (at 16500 Hz). The solution here is (1) never attempt to record and
- play back at the same time, and (2) don't call PVOICE_INIT until playback is
- ready to begin and be sure to call PVOICE_CLEANUP immediately after playback
- ends. (Similar rules apply to recording.)
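-
- To see what "ownership" of timer channel zero involves, here is a rough
- sketch of the arithmetic. It assumes the standard PC timer input clock of
- about 1.193182 MHz and a 16-bit divisor; the helper names are hypothetical,
- and the exact divisor that VRMOD and VPMOD program is not stated in this
- document.
-

```c
#include <stdint.h>

#define PIT_INPUT_HZ 1193182UL  /* standard PC timer input clock (about 1.193182 MHz) */

/* Nearest 16-bit divisor for a requested interrupt rate (hypothetical helper). */
uint16_t pit_divisor(uint32_t rate_hz)
{
    uint32_t d;
    if (rate_hz == 0)
        return 65535u;                      /* slowest rate this sketch will produce */
    d = (uint32_t)((PIT_INPUT_HZ + rate_hz / 2) / rate_hz);
    if (d < 1)
        d = 1;
    if (d > 65535UL)
        d = 65535UL;
    return (uint16_t)d;
}

/* Number of fast ticks that fit between two normal 18.2 Hz ticks
   (the normal tick corresponds to a divisor of 65536). */
uint32_t fast_ticks_per_normal_tick(uint16_t divisor)
{
    return 65536UL / divisor;
}
```

- For a requested rate of 16500 Hz this gives a divisor of 72, i.e. an actual
- rate of about 16572 Hz, and roughly 910 fast ticks for every normal 18.2 Hz
- tick. Two packages each trying to set this divisor would clearly interfere
- with one another.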
-
-
- Example Programs
- ----------------
-
- Note: "Make" files acceptable to Microsoft's Make utility are included
- for all the example programs. The compiler used was the Microsoft
- C Compiler version 5.10. The assembler was the Microsoft Macro
- Assembler version 5.10. The make files are written to assume that
- the compiler is installed to include the Large model library and
- that the default operating system is DOS. If the compiler defaults
- to the OS/2 operating system, then change the make files so that
- all occurrences of "llibce" become "llibcer".
-
- VRTEST.C (VRTEST.EXE):
- [Related files: VRTEST]
- This program works like RECORD.COM provided with the first voice
- digitization package released in 1988. It demonstrates the use of
- all the procedure calls and features in VRMOD. To execute the program,
- first attach the voice recording circuit to a COM port, then at the
- DOS prompt type: VRTEST 1 TESTFILE.VOI. If you are using COM2, then
- substitute "2" for the "1". The filename "TESTFILE.VOI" may be any filename.
- Recording will begin and messages will scroll on the screen indicating the
- number of bytes of data recorded. Writing to the file will be performed
- "on the fly". The memory buffer size is currently set to 16k, but may be
- changed by editing and recompiling the program. Recording will continue
- until either the <Esc> key is pressed or the disk is full. The size of the
- memory buffer should be at least 8k, but beyond this point it is actually
- irrelevant as long as calls to RVOICE_CATCHUP are made frequently enough
- (which means at least once every 3 seconds).
-
- VPTEST.C (VPTEST.EXE):
- [Related files: VPTEST]
- This is the counterpart to VRTEST. It demonstrates the use of all the
- procedure calls in VPMOD. As in VRTEST, the memory buffer is currently 16k
- but may be changed by editing and recompiling. The command line to execute
- the program is VPTEST TESTFILE.VOI, where "TESTFILE.VOI" is the name of
- a file containing voice data. The reading of the file will occur as needed
- to keep the buffer full or until all bytes have been read. The size of
- the memory buffer needs to be increased beyond 8K only if it is not possible
- to call PVOICE_CATCHUP at least once every 3 seconds. (Note that it may
- also be advisable to increase the buffer size if the file is being read
- from a floppy disk, since accesses may be quite slow.)
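-
- The relationship between buffer size and the 3-second catch-up interval can
- be checked with a little arithmetic. The sketch below assumes one bit of
- voice data per timer tick at 16500 Hz (about 2062 bytes per second); that
- figure is inferred from the numbers quoted in this document rather than
- stated outright, and buffer_ms is a hypothetical helper.
-

```c
#define SAMPLE_RATE_HZ 16500UL  /* assumed: one bit of voice data per timer tick */

/* Milliseconds of sound held in a buffer of the given size in bytes. */
unsigned long buffer_ms(unsigned long bytes)
{
    return bytes * 8UL * 1000UL / SAMPLE_RATE_HZ;
}
```

- An 8k buffer (8192 bytes) works out to 3971 ms, so a catch-up call every
- 3 seconds leaves about one second of margin. The default 16k buffer holds
- just under 8 seconds.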
-
- EMBEDDED.C (EMBEDDED.EXE):
- [Related files: EMBEDDED, EVM.VOI, EVM.PRE, EVM.SUF]
- This is a simple example of the techniques used to embed voice data in an
- executable program. Instead of reading a separate voice file, the voice
- data is part of the EXE file. Note that the "make" file in this case is
- as important to study as the C program. The trick here is to convert the
- raw binary voice data file into an OBJ file that we can feed through the
- linker. This is done in three stages: (1) The file-cruncher program BIN2ASM
- is used to create a file containing only a long list of assembly language
- DB statements equivalent to the binary data; (2) The prefix file EVM.PRE
- and the suffix file EVM.SUF are combined with the DB statements to form
- an assembly language module containing all necessary segment brackets and
- public declarations; (3) This module is assembled and linked with the main
- program. The contents of the prefix and suffix files depend on the specific
- application; in this example we use only a single segment and a single
- block of voice data. A more complex program may contain several modules of
- this type or have an assortment of labels within a single module. Since the
- assembler requires segments to be 64k or less, BIN2ASM places a marker
- comment (a semicolon and a string of minus signs) at each 64k boundary in
- its output file. If this happens, you must edit the file to end a segment
- and begin a new one at each of these boundaries.
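-
- Stage (1) can be sketched as follows. This is NOT the source of BIN2ASM
- (see BIN2ASM.C for that); it is a minimal illustration of the idea. The
- number of bytes per line, the hex formatting, and the exact length of the
- marker comment are assumptions, and write_db_statements is a hypothetical
- name.
-

```c
#include <stdio.h>
#include <stddef.h>

#define BYTES_PER_LINE 16        /* assumed; the real BIN2ASM layout may differ */
#define SEGMENT_LIMIT  65536UL   /* the assembler's 64k segment limit */

/* Write DB statements for data[0..n) to out.
   Returns the number of 64k marker comments written. */
int write_db_statements(FILE *out, const unsigned char *data, size_t n)
{
    int markers = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (i % BYTES_PER_LINE == 0) {
            if (i > 0)
                fputc('\n', out);               /* finish the previous line */
            if (i > 0 && i % SEGMENT_LIMIT == 0) {
                /* 64k boundary: a semicolon and a string of minus signs */
                fputs(";----------------------------------------\n", out);
                markers++;
            }
            /* the leading zero keeps values such as FFH valid assembler numbers */
            fprintf(out, "\tDB\t%03XH", (unsigned)data[i]);
        } else {
            fprintf(out, ",%03XH", (unsigned)data[i]);
        }
    }
    if (n > 0)
        fputc('\n', out);
    return markers;
}
```

- A nonzero return value means the output contains marker comments, so it
- must be split into more than one segment by hand before it is combined
- with the prefix and suffix files.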
-
- TSR.ASM (TSR.EXE):
- [Related files: TSR, TSRVM.VOI, TSRVM.PRE, TSRVM.SUF]
- This serves as both an example of a pure assembly language program using
- VPMOD and a technique for including voice playback in a memory-resident
- program. The voice data is embedded in the EXE file in the same way as it
- was done in EMBEDDED.EXE above. Otherwise, the program is fairly
- conventional. There is one major caution to observe, however: since a
- memory-resident program may play voice concurrently with the execution of
- another unknown program, don't set the file read flag (in PVOICE_START)
- to 1 and don't use PVOICE_CATCHUP! Use of the "read-on-the-fly" feature
- of the voice control routines calls DOS to read the disk. If a DOS call
- is made within an interrupt service routine (especially a timer tick
- routine), the interrupt may have occurred while a DOS call was already in
- progress. In this case, DOS will be "re-entered", and it is NOT re-entrant.
- Doing this will almost certainly cause a system crash.
-
- If you are already familiar with the above problem, and have worked out a
- system of calling DOS in the background during its "safe" moments, then
- you probably will be able to use read-on-the-fly. Always call PVOICE_START,
- PVOICE_INIT, PVOICE_CLEANUP, and PVOICE_CATCHUP during "safe" times. Also,
- remember that timer interrupts will now be happening at about 16500 Hz, so
- make sure that your program never disables interrupts for more than a very
- short time. (One more thing: if you must hook INT 8, do it BEFORE calling
- PVOICE_INIT.)
-
-
- The Voice Data File Editor (VDFE)
- ---------------------------------
-
- This program provides a convenient environment for creating, editing, and
- generally patching together voice data files. Its function resembles that of
- a tape recorder. It edits files only within its RAM buffer, which is limited
- by the amount of memory on the machine available to DOS. On a 640k machine,
- this translates to about 470k of buffer space, or 225 seconds (3 minutes and
- 45 seconds) of continuous sound. If you need to edit nonstop chunks of voice
- data longer than that, they will have to be edited piecemeal and concatenated
- afterward. (Of course, multi-megabyte voice data files may be recorded using
- VRTEST or a similar program. If it turns out that people really need to edit
- super-long files on a regular basis, I will include infinite-file-length
- editing on a future release.)
-
- VDFE requires no command line parameters. Upon execution, it displays its
- primary screen and waits for user input. This consists primarily of single
- keystroke commands, which are hereby documented in some detail:
-
- <Up arrow> and <Down arrow>:
- These are used to scroll the contents of the Operating Instructions window
- in the lower right area of the screen. The window displays one-line
- descriptions of all the keystroke commands.
-
- <F1>:
- Displays an information screen which briefly describes the purpose and
- operation of VDFE.
-
- <Esc>:
- Exits to DOS. If the contents of the editing buffer have been altered since
- the last save to disk, the user is asked to confirm the exit command.
-
- <F2>:
- Increments the COM port number shown at the left side of the screen. This
- will be the port used for recording. Press <F2> repeatedly until the desired
- port number shows.
-
- <F3>:
- Requests a file name, then loads the file into the edit buffer starting at
- offset zero. The end-of-file position will be set to match the length of the
- file. If the specified file does not exist, the user will be asked whether
- to create the file. If the answer is "yes", then a zero-length file is
- created and the end-of-file position is set to zero. The actual data in the
- edit buffer remains unchanged.
-
- <Alt F3>:
- Requests the entry of a new file name. This becomes the current file name as
- shown at the left side of the screen. Nothing is done with this name
- immediately. The new file name will be used in subsequent "save current
- data" (<F4>) operations.
-
- <F4>:
- Saves current data. The current filename is opened and truncated, and the
- contents of the edit buffer from offset zero to the offset shown as the
- end-of-file are written to the file.
-
- <Space bar>:
- The "Stop" button. If a record or playback operation is in progress, it is
- stopped.
-
- <Enter>:
- The "play" button. The contents of the edit buffer are played back through
- the speaker starting from the current position. Playback ends at the
- end-of-file position. If the current position is greater than or equal to
- the end-of-file position, playback will not occur.
-
- <Insert>:
- The "record" button. Digitized voice is input through the selected COM port
- and written into the edit buffer. Writing begins at the current position,
- overwriting existing data. Recording can be stopped by pressing <Space>,
- <Enter>, or any key which normally has the function of changing the current
- position. If the current position during recording exceeds the end-of-file
- position, then the end-of-file position is moved forward continuously to
- match the current position. If the current position reaches the end of the
- edit buffer, then wrap-around will occur, causing recording to continue at
- offset zero.
-
- <Left arrow>:
- Medium-speed rewind. The current position will be decremented by 256, which
- corresponds to about 1/8 second of voice time.
-
- <Right arrow>:
- Medium-speed forward. The current position will be incremented by 256.
-
- <Ctrl left arrow>:
- Fine rewind. The current position will be decremented by 1 byte.
-
- <Ctrl right arrow>:
- Fine forward. The current position will be incremented by 1 byte.
-
- <Page Up>:
- High-speed rewind. The current position will be decremented by 8192, which
- corresponds to about 4 seconds of voice time.
-
- <Page Down>:
- High-speed forward. The current position will be incremented by 8192.
-
- <Home>:
- The current position is set to zero.
-
- <End>:
- The current position is set to match the end-of-file position.
-
- <Ctrl end>:
- The end-of-file position is set to match the current position.
-
- <0> through <9>:
- Set marker. There are 10 markers, numbered 0 through 9. Each marker consists
- of a slot in which a "current position" may be stored. Any time a digit key
- is pressed, regardless of the stopped/playing/recording state, the current
- position at that instant is copied into the corresponding marker. The marker
- values are displayed in a window in the lower left area of the screen.
-
- <Alt 0> through <Alt 9>:
- Pressing a digit key (on the main section of the keyboard, NOT the numeric
- keypad) while holding the <Alt> key down causes the current position to
- change to match the value stored in the corresponding marker.
-
- <F5> and <F6>:
- Change the marker numbers which are assigned the "begin" and "end" flags.
- In the left column of the marker window, two of the marker number positions
- always contain 'beg' and 'end' rather than a digit. These are the ones used
- in any operation that refers to a "marked section". Initially, marker 0 is
- the "begin" marker and marker 1 is the "end". Press <F5> repeatedly to move
- the 'beg' to the desired marker. Press <F6> repeatedly to move the 'end' to
- the desired marker. The two flags are not allowed to be assigned to the same
- marker.
-
- <Tab>:
- Sets the current position to match the "begin" marker and initiates a
- playback operation which will terminate at the "end" marker.
-
- <F7>:
- A filename is requested from the user. The contents of the marked section of
- the edit buffer are written to this file. If the file already exists, it
- will be overwritten. The current filename remains unchanged.
-
- <F8>:
- A filename is requested from the user. The contents of this file are copied
- into the edit buffer starting at the "begin" marker. The "end" marker is
- changed to reflect the size of the file. The current filename remains
- unchanged.
-
- <F9>:
- The marked section will be erased (filled with zeros).
-
- <F10>:
- This causes the editor to enter a mode in which text may be typed into the
- column of the marker window titled "comments". These are simply reference
- notes and have no effect on the operation of the editor. The comment entry
- mode is exited by pressing the <Esc> key.
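-
- The position bookkeeping described for the record button, the digit keys,
- and the 'beg' and 'end' flags can be modeled in a few lines. The source code
- to VDFE is not part of this archive, so everything below is a sketch under
- assumptions: the Editor structure and all function names are hypothetical,
- the end-of-file position is assumed to stay put after wrap-around, and the
- flags are assumed to skip over each other when cycled.
-

```c
#include <stddef.h>

#define NUM_MARKERS 10

typedef struct {
    unsigned char *data;           /* edit buffer */
    size_t size;                   /* capacity of the edit buffer */
    size_t pos;                    /* current position */
    size_t eof;                    /* end-of-file position */
    size_t marker[NUM_MARKERS];    /* stored positions for markers 0..9 */
    int beg;                       /* marker carrying the 'beg' flag */
    int end;                       /* marker carrying the 'end' flag */
} Editor;

void editor_init(Editor *e, unsigned char *buffer, size_t size)
{
    size_t i;
    e->data = buffer;
    e->size = size;
    e->pos = 0;
    e->eof = 0;
    for (i = 0; i < NUM_MARKERS; i++)
        e->marker[i] = 0;
    e->beg = 0;                    /* initially marker 0 is "begin" ... */
    e->end = 1;                    /* ... and marker 1 is "end" */
}

/* Record: overwrite at the current position, drag end-of-file along,
   and wrap to offset zero at the end of the buffer. */
void record_byte(Editor *e, unsigned char value)
{
    if (e->size == 0)
        return;
    e->data[e->pos] = value;
    e->pos++;
    if (e->pos > e->eof)
        e->eof = e->pos;
    if (e->pos >= e->size)
        e->pos = 0;
}

/* Digit key: copy the current position into a marker. */
void set_marker(Editor *e, int digit)  { e->marker[digit] = e->pos; }

/* Alt + digit: move the current position to a marker. */
void goto_marker(Editor *e, int digit) { e->pos = e->marker[digit]; }

/* Move a flag to the next marker, skipping the one held by the other flag
   so that both flags never land on the same marker. */
void cycle_beg(Editor *e)
{
    do { e->beg = (e->beg + 1) % NUM_MARKERS; } while (e->beg == e->end);
}

void cycle_end(Editor *e)
{
    do { e->end = (e->end + 1) % NUM_MARKERS; } while (e->end == e->beg);
}
```

- In this model, cycling 'beg' from its initial marker 0 lands on marker 2,
- because marker 1 already carries 'end'.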
-
-
- Graphical Print Files
- ---------------------
-
- These files are prepared for output to an HP LaserJet Plus printer with the
- minimum memory configuration (512k). To print one of the files, use
- "COPY /B <filename> LPT1:" (or LPT2 if appropriate). The following lists the
- contents of each file:
-
- Filename Density Description
- -------- ------- -----------
-
- VMSCH.HPP 150 dpi The schematic to the Digitizer.
-
- VMPCB.125 300 dpi A positive print of the "copper side" of
- a single-sided circuit board implementing
- the Digitizer, suitable for photo-reduction
- to board manufacturing negatives. Scale is
- 1.250, producing the largest image that will
- fit in the LaserJet 512k memory.
-
- VMSLK.125 300 dpi A positive print of a silkscreen component
- placement guide for the component side of
- the board. This may be either silkscreened
- onto the board or simply printed out and
- referred to while building the board. Scale
- is 1.250.
-
- VMDRL.125 300 dpi A drilling guide for use in making numeric-
- control tool tapes with a digitizing pad.
- This print will not be of much use to those
- who will be drilling the holes by hand.
- Scale is 1.250.
-
- VMPCB.100 300 dpi A duplicate of VMPCB.125, but scaled 1:1 for
- use with contact-print or direct transfer
- methods of producing the negatives.
-
- VMSLK.100 300 dpi A duplicate of VMSLK.125, scaled 1:1.
-
- VMDRL.100 300 dpi A duplicate of VMDRL.125, scaled 1:1.
-
- Due to the large size of the printed circuit board files, and the probability
- that most users will not actually want to manufacture a board for this
- device, these files are placed in a separate archive. Only the schematic,
- VMSCH.HPP, is included in this archive.
-
- All of these plots are available to registered users formatted for output on
- a variety of other printers and pen plotters (photoplotters also). Contact
- Farpoint Software at the address / CIS number shown in the Shareware Notice
- section of this document.
-
-
- Schematic Notes
- ---------------
-
- The circuit is designed to operate from two 9-volt batteries connected to J1
- and J2. The original circuit used a single-ended supply. This modification
- requires fewer parts and produces the correct RS-232 voltages at the output.
-
- Pad resistors have been added to the trimpot. This control in the original
- version was somewhat difficult to adjust. The pad resistors decrease the
- sensitivity of this control enough to allow a 1-turn potentiometer to be used,
- thus reducing the length of the "hunt" for the proper position.
-
- If your serial port uses a DB-9 connector, the cable from J4 is:
-
- J4 pin 1 -------- DB-9 pin 5 (Ground)
- J4 pin 2 -------- DB-9 pin 8 (CTS)
-
- If your serial port uses a DB-25 connector, the cable from J4 is:
-
- J4 pin 1 -------- DB-25 pin 7 (Ground)
- J4 pin 2 -------- DB-25 pin 5 (CTS)
-
- The circuit consists of two stages of voltage amplification with some
- high-pass filtering built into the coupling capacitors, followed by a
- differentiator. The output of the differentiator is fed to a voltage
- comparator, thus producing an output which has approximately the following
- relationship to the input from the microphone: if the derivative of the
- speech waveform is positive, then the output is logic zero; if the derivative
- of the speech waveform is negative, then the output is logic one. The
- transition timing at the output is entirely analog in nature; there is no
- synchronizing clock signal anywhere in the circuit.
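-
- The input/output relationship just described can be imitated in software.
- The real circuit is analog and unclocked, so the following is only a
- discrete-time approximation operating on already-sampled values. The
- function name is hypothetical, and holding the previous bit when the
- waveform is flat is an assumption (in the real circuit, noise and the
- trimpot setting decide that case).
-

```c
#include <stddef.h>

/* Produce one bit per pair of adjacent samples:
   0 while the waveform is rising, 1 while it is falling.
   Writes n - 1 bits and returns that count (0 if n < 2). */
size_t encode_derivative_sign(const int *samples, size_t n, unsigned char *bits)
{
    unsigned char last = 0;
    size_t i;

    if (n < 2)
        return 0;
    for (i = 0; i + 1 < n; i++) {
        int diff = samples[i + 1] - samples[i];
        if (diff > 0)
            last = 0;          /* positive derivative: logic zero */
        else if (diff < 0)
            last = 1;          /* negative derivative: logic one */
        /* diff == 0: keep the previous bit (assumption) */
        bits[i] = last;
    }
    return n - 1;
}
```

- For the samples 0, 3, 5, 5, 2, 1, 4 this yields the bits 0, 0, 0, 1, 1, 0.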
-
- If the output of this circuit is connected directly to a speaker, the
- resulting sound will still be an understandable version of the input. Since
- the output consists of nothing but a digital bit stream, the job of the
- computer becomes that of simply recording and accurately reproducing this bit
- stream.
-
- The trimpot at the input of amplifier U3 is used to set the DC idle voltage
- output from the differentiator to somewhere near the threshold of comparator
- U4. There will be a considerable amount of noise at the output of U3,
- originating at the microphone and within the input circuitry of U1, and
- highly amplified by U1 and U2. The trimpot should be adjusted so that the
- comparator threshold is just outside the normal excursion of the noise signal
- ("off to one side"), otherwise "silence" at the microphone will become, at
- the speaker output from the computer, a loud hiss with a strong component at
- half the sampling frequency.
-
- I used LF356's for U1, U2, and U3, and an LM393 for U4. All amplifiers should
- have power supply bypass capacitors (not shown). The microphone is a 600 ohm
- dynamic type. The +-12 volt power supply should be quiet and well-regulated;
- the one in the PC is too noisy unless you use heavy filtering. Power supply
- bypassing consists of attaching capacitors in the 0.1 uF range (up to 1 uF is
- ok) DIRECTLY across the power supply pins of each amplifier chip. Layout is
- important here. The capacitors should use the shortest possible wire length
- to the pins of the chips. There will be 8 caps required: one from +12 to
- ground and one from -12 to ground for each chip. If you use dual or quad
- amplifier chips instead of the LF356's, then of course only one set of caps
- is required per actual chip. The purpose of the bypass caps is to provide a
- highly localized low-impedance power source at each chip to prevent unwanted
- positive feedback through the power leads (feedback between different chips).
-
-
- Comments on the Digitization Technique
- --------------------------------------
-
- The speaker on the PC and its associated driver circuitry is quite simple and
- crude, having been designed primarily for creating single square-wave tones
- of various audio frequencies. This speaker is typically driven by a pair of
- transistors used as a current amplifier, which is in turn driven directly by
- the output of a TTL gate. This results in only two possibilities of voltage
- across the voice coil: 0 volts and 5 volts. Any sound to be reproduced by
- this system must be reduced to an approximation in the form of a stream of
- constant-amplitude, variable-width rectangular pulses.
-
- Examination of a speech waveform on an oscilloscope display quickly tells us
- that it is not going to be possible to even remotely mimic this waveform
- under the above restrictions. Much of the information contained in the
- waveform is in the form of amplitude variations, and this is the one
- attribute we cannot reproduce. It is initially tempting to try to use the
- technique of the "class D" amplifier to create the waveform, using high-speed
- pulse width modulation and depending on the mechanical characteristics of the
- speaker and those of the human ear to provide the missing low-pass filtering.
- Assuming the sampling rate to be 8 kHz (based on the Nyquist criterion) and,
- to conserve memory, assuming the samples to contain only 4 bits of amplitude
- information (16 levels), we can see that data accumulates at a rate of 4k
- bytes per second, which is certainly acceptable. The problem comes when we
- try to play back the sound. Pulses occur at intervals of 125 microseconds,
- which doesn't seem too bad, but since each pulse can have 16 possible widths,
- it is necessary to time the pulses with a resolution of well under 8
- microseconds. This is only a couple of instruction times on a 4.77 MHz XT,
- and even on a fast 80386 it doesn't give the CPU much time between bits to
- shift bits, read and increment a pointer, check the pointer to see if it's
- done yet, etc., not to mention the difficulty of servicing unrelated
- interrupts.
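-
- The figures in this paragraph are easy to reproduce. The helpers below are
- hypothetical and use integer arithmetic, so results are truncated; the
- 4.77 MHz clock is the XT figure quoted above.
-

```c
/* Storage consumed per second of sound, in bytes. */
unsigned long pwm_bytes_per_second(unsigned long rate_hz, unsigned bits)
{
    return rate_hz * bits / 8UL;
}

/* Time between successive pulses, in nanoseconds. */
unsigned long pwm_period_ns(unsigned long rate_hz)
{
    return 1000000000UL / rate_hz;
}

/* Timing resolution needed to distinguish 2^bits pulse widths, in nanoseconds. */
unsigned long pwm_step_ns(unsigned long rate_hz, unsigned bits)
{
    return pwm_period_ns(rate_hz) >> bits;
}

/* Whole CPU clock cycles available within one timing step. */
unsigned long cpu_clocks_per_step(unsigned long step_ns, unsigned long cpu_khz)
{
    return step_ns * cpu_khz / 1000000UL;
}
```

- With 8000 samples per second and 4 bits per sample, these give 4000 bytes
- per second, a 125000 ns pulse interval, and a 7812 ns timing step, which is
- about 37 clock cycles at 4.77 MHz.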
-
- The search for simpler (but still usable) and less CPU-intensive methods of
- reproducing speech leads to the question of what information in the waveform
- we can discard without an unacceptable loss of intelligibility. My
- experiments with running speech signals through a graphic equalizer revealed
- that the lower-frequency components, those which are most visible to the eye
- on the oscilloscope, are actually of minimal importance in understanding
- speech. This is also demonstrated by the fact that a whisper is just as
- understandable as normal speech, but does not make use of vibrating vocal
- cords, which are the primary source of low-frequency components in the
- voice.
-
- The present digitizer circuit makes use of this observation by filtering out
- most of the low-frequency components of the sound signal. Knowing that the
- speaker cone cannot move instantaneously and serves as an approximation to
- a mechanical integrator at high audio frequencies leads to the idea of
- differentiating the input waveform. This accomplishes the following result:
- the direction of movement of the speaker cone corresponds to the direction of
- movement (derivative) of the waveform. Amplitude information is lost. As it
- turns out, this is sufficiently understandable to be worth pursuing.
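-
- As a rough illustration of the speaker acting as an integrator, the sketch
- below accumulates a recorded bit stream into a cone position. It is an
- idealized model, not a description of a real speaker: the function name is
- hypothetical, the step size is arbitrary, the polarity (logic zero moves the
- cone "up") is an assumption, and a real cone does not integrate without
- limit.
-

```c
#include <stddef.h>

/* Idealized integrator: each 0 bit moves the level up by 'step',
   each 1 bit moves it down by 'step'. Writes n levels, returns n. */
size_t integrate_bits(const unsigned char *bits, size_t n, int step, int *out)
{
    int level = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        level += bits[i] ? -step : step;
        out[i] = level;
    }
    return n;
}
```

- Feeding in the bits 0, 0, 0, 1, 1, 0 with a step of 1 gives the levels
- 1, 2, 3, 2, 1, 2. The rises and falls follow the direction of the original
- waveform, but every step has the same size, which is the loss of amplitude
- information mentioned above.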
-